What is R?
R is a programming language for statistical computing and graphics.
What is RStudio?
RStudio is an Integrated Development Environment (IDE): software that provides a user-friendly interface for writing, organizing, and running R code.
Analogy
R is like a car engine and RStudio is like a car dashboard.
Components of RStudio
Console
The place where you can type and run R commands directly. It shows immediate results and error messages.
Source Editor
Where you write, edit, and save your R scripts or programs. You can run parts or all of your code from here.
Environment/History
Shows the data objects (like variables, data frames) currently in memory and keeps a history of commands you’ve run.
Plots
Displays graphs and charts created by your R code for visual data analysis.
Files/Packages/Help/Viewer
- Files: Manage your project files and folders.
- Packages: Install and load add-ons that extend R’s functionality.
- Help: Access documentation and help files.
- Viewer: Display web content or interactive visualizations within RStudio.
Terminal
A command-line interface where you can run system commands or use other programming languages alongside R.
Starting a Project in RStudio
Why Start a Project?
- Keeps files and work organized by project.
- Prevents mixing data or scripts from different analyses.
- Makes sharing and reproducing work easier.
What is the Working Directory?
- The folder where R reads and saves files by default.
- Acts as R’s “home folder” during your session.
How the Working Directory Works with Projects
- When you start a new project, RStudio sets the working directory to that project’s folder.
- Any files you create or export are saved there by default.
- This helps keep all your project files together and easy to find.
Setting the working directory
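A minimal sketch of checking and changing the working directory from the console. `getwd()`, `setwd()`, and `file.path()` are base R functions; the folder and file names below are hypothetical placeholders, not files from this course.

```r
# Show the current working directory
# (inside an RStudio project this is the project folder)
current_dir <- getwd()
print(current_dir)

# Build a path relative to the working directory.
# "data" and "blood_pressure.csv" are hypothetical names used for illustration.
data_path <- file.path("data", "blood_pressure.csv")
print(data_path)

# Change the working directory manually (rarely needed inside a project).
# The path is a placeholder; replace it with a folder that exists on your computer.
# setwd("path/to/your/project")
```

Relative paths like `data_path` keep working when the whole project folder is moved or shared, which is one reason projects are preferred over calling `setwd()` by hand.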
Packages and Functions in R
What is a Package?
A collection of functions, data, and code designed to accomplish specific tasks.
What is a Function?
Functions take input, do something with it, and return output.
Analogy: a package is like a toolbox, and a function is a specific tool in the box.
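To make the toolbox analogy concrete, here is a small sketch that uses the `stats` package, which ships with R and is attached by default. The blood pressure readings are made-up values for illustration.

```r
# The stats package (toolbox) is attached automatically when R starts;
# library() is shown here only to illustrate how a package is attached.
library(stats)

# Hypothetical systolic blood pressure readings
systolic <- c(118, 120, 135, 142)

# sd() is one function (tool) from the stats package
sd(systolic)

# The package::function form names the toolbox explicitly
middle_value <- stats::median(systolic)
print(middle_value)
```

The `package::function` form is handy when two packages contain functions with the same name.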
Essential Packages
| Package | Purpose | Example Functions |
| --- | --- | --- |
| dplyr | Data wrangling | filter, select, mutate |
| tidyverse/tidyr | Data organization | pivot_longer, drop_na |
| ggplot2 | Data visualization | geom_point, facet_wrap |
| readr | Data import | read_csv |
Basics of Using a Function
To use a function, you have to give it the information (arguments) it needs to complete its task.
To find out what it needs, use the help function help(function_name) or its shortcut ?function_name. You can also click the “Help” tab.
There you can read the function's description, usage, arguments, details, and examples.
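A short sketch of looking up what a function needs and then supplying it. `help()`, `?`, and `args()` are base R tools; `mean()` and `sd()` are used only as familiar examples.

```r
# Open the help page for mean() (both lines do the same thing)
help(mean)
?mean

# Print just the argument list of a function
args(sd)

# Supply the arguments the function asks for:
# x is the data, na.rm controls whether missing values are dropped
spread <- sd(x = c(2, 4, 6), na.rm = TRUE)
print(spread)
```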
Making Your Very Own Function!
Writing your own functions saves time when you repeat the same task over and over.
The components of a function are its name, its inputs (arguments), the code that says what the function should do (the body), and its output (the return value).
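The four components can be seen in a small sketch. The function name `calculate_bmi` and its argument names are hypothetical, chosen only for illustration.

```r
# Name: calculate_bmi
# Inputs: weight in kilograms and height in meters
calculate_bmi <- function(weight_kg, height_m) {
  # Body: the code that does the work
  bmi <- weight_kg / height_m^2
  # Output: the value handed back to the caller
  return(bmi)
}

# Use the function on one participant
calculate_bmi(weight_kg = 70, height_m = 1.75)

# The same function works on several participants at once
calculate_bmi(weight_kg = c(60, 82), height_m = c(1.60, 1.80))
```

Once defined, the function can replace every copy of the same formula in a script, so a correction only has to be made in one place.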
Updating Packages
Updating packages ensures you are working with the latest versions, which can include fixes for old bugs. If you are in the middle of an analysis or project, however, updating a package might not be desirable, because a new version can change how functions behave.
Tools → Check for Package Updates → Select and update
OR
Use update.packages(ask = FALSE) to update all installed packages, or install.packages("package_name", dependencies = TRUE) to reinstall the latest version of a single package.
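Before updating in the middle of a project, it can help to note which versions you are currently using. This is a sketch using base R helpers; `stats` is used only because it is always installed.

```r
# Check the installed version of one package
stats_version <- packageVersion("stats")
print(stats_version)

# Record the R version for the current session
session_details <- sessionInfo()
print(session_details$R.version$version.string)

# List installed packages that have newer versions available
# (needs an internet connection, so it is left commented out here)
# old.packages()
```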
Data Types in R
Fundamental Data Types
In R, variables can be stored as several types of data
Each data type supports different operations; for example, you can do arithmetic on numeric values but not on character strings.
Data Types We Might Use
| Data Type | Example | Use |
| --- | --- | --- |
| numeric | 120, 34.2 | blood pressures, BMI |
| integer | 15, 26, 0003 | steps, number of siblings, participant ID |
| character (string) | placebo, tall | randomization arm, category |
| logical (boolean) | TRUE, FALSE | survival, presence of health condition |
# Numeric data: BMI
bmi <- 24.8
class(bmi)
[1] "numeric"
# Integer data: Participant ID
id <- 0003L
class(id)
[1] "integer"
# Character data: randomization group
treatment_arm <- "placebo"
class(treatment_arm)
[1] "character"
# Logical data: presence of hypertension
has_hypertension <- FALSE
class(has_hypertension)
[1] "logical"
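One detail worth checking in the participant ID example: an integer does not keep leading zeros, so `0003L` is stored as 3. The sketch below shows two common ways to keep the zeros; the variable names are hypothetical.

```r
# An integer drops leading zeros
id_int <- 0003L
print(id_int)

# Storing the ID as character keeps it exactly as typed
id_chr <- "0003"
class(id_chr)

# sprintf() can pad an integer back to four digits when needed
id_padded <- sprintf("%04d", id_int)
print(id_padded)
```

If leading zeros are part of the ID, character is usually the safer type, since IDs are labels rather than quantities.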
Why Data Types Matter in R
Knowing your data types helps prevent bugs and weird results
Functions behave differently depending on type
Mistakes often create errors (and that’s OK!)
# Numeric vs. character: addition
num1 <- 5
num1 + 2
[1] 7
num2 <- "5"
num2 + 2
Error in num2 + 2: non-numeric argument to binary operator
# Logical used in math
x <- TRUE
y <- FALSE
sum(c(x, y, TRUE))
[1] 2
# Sorting: numeric vs. character
ages <- c(15, 9, 2)
sort(ages)
[1] 2 9 15
as_char <- as.character(ages)
sort(as_char)
[1] "15" "2" "9"
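Most of these surprises can be fixed by converting the value to the type you meant. A brief sketch using the base R functions `as.numeric()` and `is.numeric()`:

```r
# Convert character to numeric before doing math
num2 <- "5"
fixed_sum <- as.numeric(num2) + 2
print(fixed_sum)

# Convert back to numeric so sorting is by value, not alphabetical
as_char <- c("15", "9", "2")
sorted_ages <- sort(as.numeric(as_char))
print(sorted_ages)

# Check a type before relying on it
is.numeric(num2)
```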
# Missing data: numeric NA
num_values <- c(100, NA, 200)
mean(num_values)
[1] NA
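A single missing value makes `mean()` return NA unless you ask it to skip missing values. A small sketch of finding and handling NA with base R:

```r
num_values <- c(100, NA, 200)

# Any NA in the input makes the result NA
mean(num_values)

# Find which values are missing, and count them
is.na(num_values)
n_missing <- sum(is.na(num_values))
print(n_missing)

# na.rm = TRUE drops missing values before calculating
mean_without_na <- mean(num_values, na.rm = TRUE)
print(mean_without_na)
```

Whether dropping missing values is appropriate depends on the analysis, so it is worth counting them first.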
Error in `$<-`:
! Assigned data `tb$ave + 5` must be compatible with existing data.
✖ Existing data has 3 rows.
✖ Assigned data has 0 rows.
ℹ Only vectors of size 1 are recycled.
Caused by error in `vectbl_recycle_rhs_rows()`:
! Can't recycle input of size 0 to size 3.
# Create and print a regular data frame and a tibble for comparison
df3 <- data.frame(id = 1:24, name = rep(c("Ann", "Ben", "Cam"), 8))
class(df3)
[1] "data.frame"
df3
id name
1 1 Ann
2 2 Ben
3 3 Cam
4 4 Ann
5 5 Ben
6 6 Cam
7 7 Ann
8 8 Ben
9 9 Cam
10 10 Ann
11 11 Ben
12 12 Cam
13 13 Ann
14 14 Ben
15 15 Cam
16 16 Ann
17 17 Ben
18 18 Cam
19 19 Ann
20 20 Ben
21 21 Cam
22 22 Ann
23 23 Ben
24 24 Cam
library(tibble)  # provides tibble()
tb3 <- tibble(id = 1:24, name = rep(c("Ann", "Ben", "Cam"), 8))
class(tb3)
[1] "tbl_df" "tbl" "data.frame"
tb3
# A tibble: 24 × 2
id name
<int> <chr>
1 1 Ann
2 2 Ben
3 3 Cam
4 4 Ann
5 5 Ben
6 6 Cam
7 7 Ann
8 8 Ben
9 9 Cam
10 10 Ann
# ℹ 14 more rows
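The comparison shows that a regular data frame prints every row, while a tibble prints the first 10 and reports the column types. With a regular data frame you can get a similar compact view using base R functions, as in this sketch:

```r
# Same data frame as in the comparison
df3 <- data.frame(id = 1:24, name = rep(c("Ann", "Ben", "Cam"), 8))

# Print only the first 10 rows
first_rows <- head(df3, 10)
print(first_rows)

# Number of rows and columns
dim(df3)

# Column names and types, similar to the <int> and <chr> labels a tibble shows
str(df3)
```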
GitHub and CRAN
| Source | Description | When to Use | Install With |
| --- | --- | --- | --- |
| CRAN | Official R package repository | Most stable and tested version | install.packages("pkgname") |
| GitHub | Developer’s source code (often in progress) | Get latest features or unreleased updates | devtools::install_github("user/pkgname") |
Think of CRAN like the App Store — safe, reviewed, and stable.
Think of GitHub like the developer’s lab — early access, but maybe still being tested.
# Install from CRAN
install.packages("ggplot2")
# Install from GitHub
install.packages("devtools")
devtools::install_github("bhelsel/RLAB")
GitHub packages may require additional setup or dependencies.
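To avoid reinstalling a package every time a script runs, you can check whether it is already installed first. This is a sketch only: `install_if_missing` is a hypothetical helper name, while `requireNamespace()` and `install.packages()` are base R functions.

```r
# Install a CRAN package only if it is not already installed.
# Returns TRUE (invisibly) when the package is available afterwards.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# stats ships with R, so nothing is downloaded here
install_if_missing("stats")

# Typical use before a GitHub install (commented out: needs internet access)
# install_if_missing("devtools")
# devtools::install_github("bhelsel/RLAB")
```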